Crowdsourced Adaptive Surveys Replication
Yamil Ricardo Velez  
Columbia University  
2024-10-30  

This repository contains the code and data needed to replicate the results from "Crowdsourced Adaptive Surveys." The project uses R for data processing, analysis, and visualization; the embedding threshold scripts are written in Python.

File Structure

# Main Analysis Files
- replication.Rmd: The primary R Markdown file. It reproduces the analysis and generates the figures and tables used in the study.
- replication.html: Compiled HTML output of the analysis.

# Data Files
- cr_issues.csv: Contains CloudResearch Connect survey data related to issue priorities.
- cr_local.csv: Contains CloudResearch Connect data related to local concerns. 
- cr_misinfo.csv: Contains CloudResearch Connect survey data about misinformation.

# Embeddings Directory
Contains files related to embedding analyses for issue and claim uniqueness.
- embedding_threshold_script/:
  - main.py: Python script for threshold-based embedding analysis.
  - main2.py: Additional script for embedding analysis.
  - mip_codes.csv: Open-ended responses and classifications from the issue study.
  - misinformation_codes.csv: Open-ended responses and classifications from the misinformation study.
- unique_claims.csv: Threshold analysis output for the misinformation study.
- unique_issues.csv: Threshold analysis output for the issue study.

# Open Source Comparison Directory
Contains files comparing issue and claim coding by open-source models (Llama and Mistral).
- llama_analysis.csv: Llama issue coding comparisons.
- llama_analysis2.csv: Llama claim coding comparisons.
- mistral_analysis.csv: Mistral issue coding comparisons.
- mistral_analysis2.csv: Mistral claim coding comparisons.

# R Package Versions

The following R package versions were used in the analysis:

- knitr: 1.45
- qualtRics: 3.2.0
- magrittr: 2.0.3
- tidyverse: 2.0.0
- hrbrthemes: 0.8.0
- modelsummary: 1.1.0
- forcats: 1.0.0
- rio: 0.5.29
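
Before knitting, it may help to compare the versions listed above against what is installed locally. The following is a minimal sketch, not part of the original replication code; the object names (`expected`, `installed_version`, `report`) are illustrative. A mismatch does not necessarily break the analysis, but it is a reasonable first thing to check if results differ.

```r
# Package versions as listed in this README.
expected <- c(
  knitr        = "1.45",
  qualtRics    = "3.2.0",
  magrittr     = "2.0.3",
  tidyverse    = "2.0.0",
  hrbrthemes   = "0.8.0",
  modelsummary = "1.1.0",
  forcats      = "1.0.0",
  rio          = "0.5.29"
)

# Return the installed version of a package as a string,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  tryCatch(
    as.character(utils::packageVersion(pkg)),
    error = function(e) NA_character_
  )
}

installed <- vapply(names(expected), installed_version, character(1))

# One row per package; `match` is FALSE when the package is
# missing or its version differs from the one listed above.
report <- data.frame(
  package   = names(expected),
  expected  = unname(expected),
  installed = unname(installed),
  match     = !is.na(installed) & installed == expected,
  row.names = NULL
)

print(report)
```

Rows where `match` is `FALSE` indicate packages that are missing or installed at a different version.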

# Run the Analysis 

Open `replication.Rmd` in RStudio and knit the document to generate the analysis report, including all tables and figures.
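
As an alternative to the Knit button, the document can also be rendered from the R console. This is a sketch under two assumptions that the repository does not state explicitly: that the `rmarkdown` package is installed, and that `replication.Rmd` reads its data files relative to its own location. The helper name `render_replication` is hypothetical.

```r
# Hypothetical helper: render the replication document from the console.
# Assumes the rmarkdown package is installed.
render_replication <- function(input = "replication.Rmd") {
  if (!file.exists(input)) {
    stop(
      input, " not found; set the working directory to the repository root.",
      call. = FALSE
    )
  }
  # Uses the output format declared in the document's own YAML header
  # and writes the result next to the input file.
  rmarkdown::render(input)
}
```

Calling `render_replication()` from the repository root should produce the same report as knitting in RStudio.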

# Contact

For any questions regarding the replication of this study or the code used, please contact Yamil Ricardo Velez at yrv2004@columbia.edu.
